
feat: Agent Teams orchestration — /team skill + teammate-aware preamble#105

Open
HMAKT99 wants to merge 2 commits into garrytan:main from HMAKT99:arun/agent-teams


HMAKT99 commented Mar 16, 2026

The problem

gstack has 11 specialized skills, but you can only use one at a time. When you need security AND risk AND code review on the same PR, you run them sequentially, copy-paste findings between sessions, and synthesize the output yourself. You are the message bus.

With Claude Code Agent Teams (experimental, v2.1.32+), teammates can message each other directly. This PR makes every gstack skill a first-class Agent Team teammate.

What this adds

Every skill now works in two modes — automatically

The {{PREAMBLE}} (shared by all skills) now detects if it's running as a teammate:

Standalone (user invokes directly):
  /cso → outputs findings to conversation

As teammate (lead spawns in a team):
  /cso → messages findings to /risk teammate
       → /risk incorporates into risk register
       → /risk messages to /board teammate
       → /board synthesizes into executive brief

No per-skill changes needed. The preamble handles mode detection, communication protocol, task claiming, and teammate discovery.
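A minimal sketch of the two-mode behavior, assuming the preamble can read its mode from the environment. Only the names `_IS_TEAMMATE` and `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS` come from this PR; treating `_IS_TEAMMATE` as an environment variable, and the `detectMode` and `deliver` helpers, are hypothetical illustrations rather than the preamble's actual logic.

```typescript
type Mode = "standalone" | "teammate";

interface Finding {
  skill: string;
  summary: string;
}

// Assumption: both flags are visible as environment-style variables.
// How the real preamble derives _IS_TEAMMATE is not shown in this PR.
function detectMode(env: Record<string, string | undefined>): Mode {
  const teamsEnabled = env["CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS"] === "1";
  const isTeammate = env["_IS_TEAMMATE"] === "1";
  return teamsEnabled && isTeammate ? "teammate" : "standalone";
}

// Standalone output goes to the conversation; teammate output is addressed
// to another teammate (for example, /cso sending findings to /risk).
function deliver(mode: Mode, finding: Finding, recipient?: string): string {
  const body = `[${finding.skill}] ${finding.summary}`;
  if (mode === "teammate" && recipient !== undefined) {
    return `message to ${recipient}: ${body}`;
  }
  return `conversation: ${body}`;
}
```

The point of the design is that a skill only produces a `Finding`; where that finding goes is decided once, in shared code.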

/team — 7 pre-built team configurations

| Command | What happens | Teammates |
|---|---|---|
| /team ship | Plan → Review + Security (parallel) → Ship → QA | 4 |
| /team review | Multi-lens code review (engineer, security, risk, perf) | 4 |
| /team launch | Media + PR + Internal comms create aligned content | 3 |
| /team incident | Escalation IC + Security + Comms war room | 3 |
| /team diligence | VC + CFO + CSO → Risk → Board synthesis | 5 |
| /team audit | Security + Risk + Finance compliance | 3 |
| /team custom | Any combination of gstack skills | N |
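One way to picture the command table is as a lookup keyed by configuration name. The sketch below is illustrative only: `TeamConfig`, `TEAMS`, and `parseTeamCommand` are hypothetical names, and the actual /team skill is a SKILL.md prompt rather than TypeScript.

```typescript
interface TeamConfig {
  flow: string;
  teammates: number | "N"; // "N": the caller picks the skills (custom)
}

// Values copied from the table above.
const TEAMS = new Map<string, TeamConfig>([
  ["ship", { flow: "Plan → Review + Security (parallel) → Ship → QA", teammates: 4 }],
  ["review", { flow: "Multi-lens code review (engineer, security, risk, perf)", teammates: 4 }],
  ["launch", { flow: "Media + PR + Internal comms create aligned content", teammates: 3 }],
  ["incident", { flow: "Escalation IC + Security + Comms war room", teammates: 3 }],
  ["diligence", { flow: "VC + CFO + CSO → Risk → Board synthesis", teammates: 5 }],
  ["audit", { flow: "Security + Risk + Finance compliance", teammates: 3 }],
  ["custom", { flow: "Any combination of gstack skills", teammates: "N" }],
]);

// Returns the matching configuration, or null for malformed or unknown input.
function parseTeamCommand(input: string): { name: string; config: TeamConfig } | null {
  const match = /^\/team\s+(\S+)/.exec(input.trim());
  if (match === null) return null;
  const name = match[1];
  if (name === undefined) return null;
  const config = TEAMS.get(name);
  return config === undefined ? null : { name, config };
}
```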

How /team diligence actually works

Lead spawns 5 teammates, each reads their SKILL.md:

  "vc"   reads /vc/SKILL.md    → moat analysis, velocity scorecard
  "cfo"  reads /cfo/SKILL.md   → cost model, build-vs-buy
  "cso"  reads /cso/SKILL.md   → OWASP audit, STRIDE model

  vc, cfo, cso work in parallel (no dependencies)

  "risk" reads /risk/SKILL.md  → waits for cso findings
                                → incorporates security into risk register
                                → messages risk register to board

  "board" reads /board/SKILL.md → waits for vc + cfo + risk
                                 → synthesizes into 2-page executive brief

Teammates message each other directly. The lead synthesizes final output.
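The ordering described above can be sketched as a small dependency graph resolved into waves. The dependency edges come from the description; `DILIGENCE_DEPS` and `planWaves` are hypothetical names, and grouping into discrete waves is a simplification, since real teammates start as soon as the messages they wait for arrive.

```typescript
// Each teammate maps to the teammates whose output it waits for.
const DILIGENCE_DEPS: Record<string, string[]> = {
  vc: [],
  cfo: [],
  cso: [],
  risk: ["cso"],
  board: ["vc", "cfo", "risk"],
};

// Groups teammates into waves: everyone in a wave depends only on
// teammates that finished in earlier waves.
function planWaves(deps: Record<string, string[]>): string[][] {
  const done = new Set<string>();
  const remaining = new Set<string>(Object.keys(deps));
  const waves: string[][] = [];
  while (remaining.size > 0) {
    const ready = Array.from(remaining).filter((name) =>
      (deps[name] ?? []).every((dep) => done.has(dep)),
    );
    if (ready.length === 0) {
      throw new Error("dependency cycle or unknown dependency");
    }
    for (const name of ready) {
      remaining.delete(name);
      done.add(name);
    }
    waves.push(ready);
  }
  return waves;
}

const waves = planWaves(DILIGENCE_DEPS);
// waves is [["vc", "cfo", "cso"], ["risk"], ["board"]]
```

The first wave is the parallel group, and "board" lands last because it waits on "risk", which in turn waits on "cso".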

team/TEAMS.md — coordination reference

Every teammate reads this to understand:

  • Message format: FROM, STATUS, TOP FINDINGS, FULL REPORT, ACTION NEEDED
  • Dependency graph: ASCII diagram showing who messages whom
  • Shared state: .gstack/ directory map for cross-teammate file access
  • Anti-patterns: don't edit same files, don't broadcast everything, don't skip the lead
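As an illustration of the message format, the five fields could be modeled as a typed record and rendered to text. The field names follow the list above; the field types, the `formatMessage` helper, and the example report path are assumptions, since TEAMS.md itself is not reproduced here.

```typescript
interface TeammateMessage {
  from: string;          // sending teammate, e.g. "cso"
  status: string;        // free-form here; TEAMS.md may constrain the values
  topFindings: string[]; // short headline findings
  fullReport: string;    // assumed to be a pointer to the full report
  actionNeeded: string;  // what the recipient should do next
}

// Renders the message with the five labels in the order listed above.
function formatMessage(m: TeammateMessage): string {
  return [
    `FROM: ${m.from}`,
    `STATUS: ${m.status}`,
    "TOP FINDINGS:",
    ...m.topFindings.map((finding) => `  - ${finding}`),
    `FULL REPORT: ${m.fullReport}`,
    `ACTION NEEDED: ${m.actionNeeded}`,
  ].join("\n");
}

const example = formatMessage({
  from: "cso",
  status: "done",
  topFindings: ["Unvalidated redirect in login flow"],
  fullReport: ".gstack/cso-report.md", // hypothetical path under .gstack/
  actionNeeded: "Incorporate into risk register",
});
```

Sending a pointer in FULL REPORT instead of the whole document fits the "don't broadcast everything" anti-pattern: the message stays short and the detail lives in shared state.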

What's NOT in this PR

Integration

  • Preamble change regenerates all 12 existing SKILL.md files (additive only)
  • /team skill follows standard .tmpl → SKILL.md pipeline
  • CLAUDE.md updated with Agent Teams section and dependency graph
  • LLM-as-Judge evals for orchestration quality, TEAMS.md clarity, preamble actionability, spawn prompt sufficiency

Test plan

  • bun test — 114 pass, 0 fail (723 assertions)
  • bun run gen:skill-docs --dry-run — all 12 SKILL.md files FRESH
  • Preamble teammate awareness identical across all skills (single source of truth)
  • EVALS=1 bun test test/agent-teams-llm-eval.test.ts — LLM quality evals (~$0.08/run)
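The paid LLM evals are opt-in: they run only with EVALS=1, and the commit notes also require ANTHROPIC_API_KEY. A sketch of such a gate, assuming a plain environment check (`evalGate` is a hypothetical helper, not code from the test file):

```typescript
interface EvalGate {
  run: boolean;
  reason: string;
}

// Decides whether the LLM-as-Judge evals should run for a given environment.
function evalGate(env: Record<string, string | undefined>): EvalGate {
  if (env["EVALS"] !== "1") {
    return { run: false, reason: "EVALS=1 not set" };
  }
  const key = env["ANTHROPIC_API_KEY"];
  if (key === undefined || key === "") {
    return { run: false, reason: "ANTHROPIC_API_KEY missing" };
  }
  return { run: true, reason: "opted in" };
}
```

In a test file, the result of `evalGate(process.env)` could decide whether the eval suite is skipped, so a plain `bun test` never spends API credits.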

Makes every gstack skill work as a Claude Code Agent Team teammate:

1. Preamble: _IS_TEAMMATE detection, communication protocol, task claiming
2. /team skill: 7 pre-built team configurations (ship, review, launch, incident, diligence, audit, custom)
3. team/TEAMS.md: coordination reference with dependency graph and message formats
4. CLAUDE.md: Agent Teams documentation section

Requires: CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 (user opts in)

Tier 3 evals (~$0.08/run) using Claude Sonnet as judge:
- /team SKILL.md orchestration quality (spawn prompts, team configs)
- TEAMS.md coordination reference quality (message formats, state locations)
- Preamble teammate awareness actionability (mode switching)
- Spawn prompt context sufficiency (enough info for autonomous teammates)

Run: EVALS=1 bun test test/agent-teams-llm-eval.test.ts
Requires: ANTHROPIC_API_KEY